{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "WfGpInivS0fG"
   },
   "source": [
    "<h2 align=\"center\">点击下列图标在线运行HanLP</h2>\n",
    "<div align=\"center\">\n",
    "\t<a href=\"https://colab.research.google.com/github/hankcs/HanLP/blob/doc-zh/plugins/hanlp_demo/hanlp_demo/zh/sdp_stl.ipynb\" target=\"_blank\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
    "\t<a href=\"https://mybinder.org/v2/gh/hankcs/HanLP/doc-zh?filepath=plugins%2Fhanlp_demo%2Fhanlp_demo%2Fzh%2Fsdp_stl.ipynb\" target=\"_blank\"><img src=\"https://mybinder.org/badge_logo.svg\" alt=\"Open In Binder\"/></a>\n",
    "</div>\n",
    "\n",
    "## 安装"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "nf9TgeCTC0OT"
   },
   "source": [
    "无论是Windows、Linux还是macOS，HanLP的安装只需一句话搞定："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "jaW4eu6kC0OU",
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "!pip install hanlp -U"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "_xI_bLAaC0OU"
   },
   "source": [
    "## 加载模型\n",
    "HanLP的工作流程是先加载模型，模型的标示符存储在`hanlp.pretrained`这个包中，按照NLP任务归类。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "IYwV-UkNNzFp",
    "outputId": "54065443-9b0a-444c-f6c0-c701bc86400b",
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'SEMEVAL16_NEWS_BIAFFINE_ZH': 'https://file.hankcs.com/hanlp/sdp/semeval16-news-biaffine_20191231_235407.zip',\n",
       " 'SEMEVAL16_TEXT_BIAFFINE_ZH': 'https://file.hankcs.com/hanlp/sdp/semeval16-text-biaffine_20200101_002257.zip',\n",
       " 'SEMEVAL16_ALL_ELECTRA_SMALL_ZH': 'https://file.hankcs.com/hanlp/sdp/semeval16_sdp_electra_small_20220208_122026.zip',\n",
       " 'SEMEVAL15_PAS_BIAFFINE_EN': 'https://file.hankcs.com/hanlp/sdp/semeval15_biaffine_pas_20200103_152405.zip',\n",
       " 'SEMEVAL15_PSD_BIAFFINE_EN': 'https://file.hankcs.com/hanlp/sdp/semeval15_biaffine_psd_20200106_123009.zip',\n",
       " 'SEMEVAL15_DM_BIAFFINE_EN': 'https://file.hankcs.com/hanlp/sdp/semeval15_biaffine_dm_20200106_122808.zip'}"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import hanlp\n",
    "hanlp.pretrained.sdp.ALL # 语种见名称最后一个字段或相应语料库"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "1Uf_u7ddMhUt",
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "调用`hanlp.load`进行加载，模型会自动下载到本地缓存。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "id": "pp-1KqEOOJ4t",
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "sdp = hanlp.load('SEMEVAL16_ALL_ELECTRA_SMALL_ZH')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "elA_UyssOut_"
   },
   "source": [
    "## 语义依存分析\n",
    "语义依存分析的输入为已分词的一个或多个句子："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "id": "BqEmDMGGOtk3"
   },
   "outputs": [],
   "source": [
    "graph = sdp([\"2021年\", \"HanLPv2.1\", \"为\", \"生产\", \"环境\", \"带来\", \"次\", \"世代\", \"最\", \"先进\", \"的\", \"多\", \"语种\", \"NLP\", \"技术\", \"。\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "SwaPn1hjC0OW"
   },
   "source": [
    "返回对象为[CoNLLSentence](https://hanlp.hankcs.com/docs/api/common/conll.html#hanlp_common.conll.CoNLLSentence)类型："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "egpWwHKxC0OX",
    "outputId": "f7c77687-dd75-4fa2-dbd2-be6bda8a3fff"
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[{'id': 1,\n",
       "  'form': '2021年',\n",
       "  'upos': None,\n",
       "  'xpos': None,\n",
       "  'head': None,\n",
       "  'deprel': None,\n",
       "  'lemma': None,\n",
       "  'feats': None,\n",
       "  'deps': [(6, 'Time')],\n",
       "  'misc': None},\n",
       " {'id': 2,\n",
       "  'form': 'HanLPv2.1',\n",
       "  'upos': None,\n",
       "  'xpos': None,\n",
       "  'head': None,\n",
       "  'deprel': None,\n",
       "  'lemma': None,\n",
       "  'feats': None,\n",
       "  'deps': [(6, 'Exp')],\n",
       "  'misc': None},\n",
       " {'id': 3,\n",
       "  'form': '为',\n",
       "  'upos': None,\n",
       "  'xpos': None,\n",
       "  'head': None,\n",
       "  'deprel': None,\n",
       "  'lemma': None,\n",
       "  'feats': None,\n",
       "  'deps': [(5, 'mPrep')],\n",
       "  'misc': None},\n",
       " {'id': 4,\n",
       "  'form': '生产',\n",
       "  'upos': None,\n",
       "  'xpos': None,\n",
       "  'head': None,\n",
       "  'deprel': None,\n",
       "  'lemma': None,\n",
       "  'feats': None,\n",
       "  'deps': [(5, 'Desc')],\n",
       "  'misc': None},\n",
       " {'id': 5,\n",
       "  'form': '环境',\n",
       "  'upos': None,\n",
       "  'xpos': None,\n",
       "  'head': None,\n",
       "  'deprel': None,\n",
       "  'lemma': None,\n",
       "  'feats': None,\n",
       "  'deps': [(6, 'Datv')],\n",
       "  'misc': None},\n",
       " {'id': 6,\n",
       "  'form': '带来',\n",
       "  'upos': None,\n",
       "  'xpos': None,\n",
       "  'head': None,\n",
       "  'deprel': None,\n",
       "  'lemma': None,\n",
       "  'feats': None,\n",
       "  'deps': [(2, 'eSucc')],\n",
       "  'misc': None},\n",
       " {'id': 7,\n",
       "  'form': '次',\n",
       "  'upos': None,\n",
       "  'xpos': None,\n",
       "  'head': None,\n",
       "  'deprel': None,\n",
       "  'lemma': None,\n",
       "  'feats': None,\n",
       "  'deps': [(8, 'Desc'), (13, 'Desc')],\n",
       "  'misc': None},\n",
       " {'id': 8,\n",
       "  'form': '世代',\n",
       "  'upos': None,\n",
       "  'xpos': None,\n",
       "  'head': None,\n",
       "  'deprel': None,\n",
       "  'lemma': None,\n",
       "  'feats': None,\n",
       "  'deps': [(0, 'Root'), (15, 'Time')],\n",
       "  'misc': None},\n",
       " {'id': 9,\n",
       "  'form': '最',\n",
       "  'upos': None,\n",
       "  'xpos': None,\n",
       "  'head': None,\n",
       "  'deprel': None,\n",
       "  'lemma': None,\n",
       "  'feats': None,\n",
       "  'deps': [(10, 'mDegr')],\n",
       "  'misc': None},\n",
       " {'id': 10,\n",
       "  'form': '先进',\n",
       "  'upos': None,\n",
       "  'xpos': None,\n",
       "  'head': None,\n",
       "  'deprel': None,\n",
       "  'lemma': None,\n",
       "  'feats': None,\n",
       "  'deps': [(15, 'Desc')],\n",
       "  'misc': None},\n",
       " {'id': 11,\n",
       "  'form': '的',\n",
       "  'upos': None,\n",
       "  'xpos': None,\n",
       "  'head': None,\n",
       "  'deprel': None,\n",
       "  'lemma': None,\n",
       "  'feats': None,\n",
       "  'deps': [(10, 'mAux')],\n",
       "  'misc': None},\n",
       " {'id': 12,\n",
       "  'form': '多',\n",
       "  'upos': None,\n",
       "  'xpos': None,\n",
       "  'head': None,\n",
       "  'deprel': None,\n",
       "  'lemma': None,\n",
       "  'feats': None,\n",
       "  'deps': [(10, 'mDegr'), (13, 'Quan')],\n",
       "  'misc': None},\n",
       " {'id': 13,\n",
       "  'form': '语种',\n",
       "  'upos': None,\n",
       "  'xpos': None,\n",
       "  'head': None,\n",
       "  'deprel': None,\n",
       "  'lemma': None,\n",
       "  'feats': None,\n",
       "  'deps': [(15, 'Desc')],\n",
       "  'misc': None},\n",
       " {'id': 14,\n",
       "  'form': 'NLP',\n",
       "  'upos': None,\n",
       "  'xpos': None,\n",
       "  'head': None,\n",
       "  'deprel': None,\n",
       "  'lemma': None,\n",
       "  'feats': None,\n",
       "  'deps': [(15, 'Desc')],\n",
       "  'misc': None},\n",
       " {'id': 15,\n",
       "  'form': '技术',\n",
       "  'upos': None,\n",
       "  'xpos': None,\n",
       "  'head': None,\n",
       "  'deprel': None,\n",
       "  'lemma': None,\n",
       "  'feats': None,\n",
       "  'deps': [(6, 'Pat')],\n",
       "  'misc': None},\n",
       " {'id': 16,\n",
       "  'form': '。',\n",
       "  'upos': None,\n",
       "  'xpos': None,\n",
       "  'head': None,\n",
       "  'deprel': None,\n",
       "  'lemma': None,\n",
       "  'feats': None,\n",
       "  'deps': [(6, 'mPunc')],\n",
       "  'misc': None}]"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "graph"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "kq_j5TLFC0OX"
   },
   "source": [
    "打印为为CoNLL格式："
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "isJhzYyIC0OX",
    "outputId": "683c8489-dffc-426e-f95b-e91dfb373260"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1\t2021年\t_\t_\t_\t_\t_\t_\t6:Time\t_\n",
      "2\tHanLPv2.1\t_\t_\t_\t_\t_\t_\t6:Exp\t_\n",
      "3\t为\t_\t_\t_\t_\t_\t_\t5:mPrep\t_\n",
      "4\t生产\t_\t_\t_\t_\t_\t_\t5:Desc\t_\n",
      "5\t环境\t_\t_\t_\t_\t_\t_\t6:Datv\t_\n",
      "6\t带来\t_\t_\t_\t_\t_\t_\t2:eSucc\t_\n",
      "7\t次\t_\t_\t_\t_\t_\t_\t8:Desc|13:Desc\t_\n",
      "8\t世代\t_\t_\t_\t_\t_\t_\t0:Root|15:Time\t_\n",
      "9\t最\t_\t_\t_\t_\t_\t_\t10:mDegr\t_\n",
      "10\t先进\t_\t_\t_\t_\t_\t_\t15:Desc\t_\n",
      "11\t的\t_\t_\t_\t_\t_\t_\t10:mAux\t_\n",
      "12\t多\t_\t_\t_\t_\t_\t_\t10:mDegr|13:Quan\t_\n",
      "13\t语种\t_\t_\t_\t_\t_\t_\t15:Desc\t_\n",
      "14\tNLP\t_\t_\t_\t_\t_\t_\t15:Desc\t_\n",
      "15\t技术\t_\t_\t_\t_\t_\t_\t6:Pat\t_\n",
      "16\t。\t_\t_\t_\t_\t_\t_\t6:mPunc\t_\n"
     ]
    }
   ],
   "source": [
    "print(graph)"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "collapsed_sections": [],
   "name": "sdp_stl.ipynb",
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
